Conversation
I've finished the main part of this PR. But I'm not sure the best way to implement the

@Fullstop000 Can you please include links to any design docs you worked on?

@Hoverbear Can I make a WIP PR in tikv/rfc?

@Fullstop000 I think it's OK. :) I'd still really appreciate seeing an RFC (like https://docs.google.com/document/d/1Sp9Tnc_nk_i0feTOgLGPT3jZYVFjfP7C6pwA-wuVHko/edit?ts=5c8c5fe3 which you worked on)
ae47fee to 3502119

@hicqu PTAL
3502119 to 0902e2f
|
|
||
| /// The max number of members of a group of which contains leader and the leader never pick a delegate. | ||
| /// If the group size is larger than this, a delegate will be picked even the leader belongs to this group. | ||
| pub max_leader_group_no_delegate: usize, |
Personally, I'd prefer to remove this, because without explicit information about group_id, follower replication may not be helpful. For example, with 7 peers in one datacenter, follower replication can't save any network traffic.
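To make the traffic argument concrete, here is a minimal sketch that counts how many append messages the leader sends outside its own group for one broadcast. The `Peer` struct and the `cross_group_appends` function are hypothetical illustrations and are not part of this PR; the sketch assumes exactly one delegate per remote group.

```rust
use std::collections::HashSet;

/// Hypothetical peer description: an id plus the group (e.g. datacenter) it lives in.
#[derive(Clone, Copy)]
pub struct Peer {
    pub id: u64,
    pub group_id: u64,
}

/// Counts the append messages the leader sends outside its own group for one broadcast.
/// Without follower replication: one message per remote peer.
/// With follower replication: one message per remote group (to its delegate).
pub fn cross_group_appends(peers: &[Peer], leader: u64, follower_replication: bool) -> usize {
    let leader_group = match peers.iter().find(|p| p.id == leader) {
        Some(p) => p.group_id,
        None => return 0, // unknown leader: nothing to count
    };
    let remote = peers.iter().filter(|p| p.group_id != leader_group);
    if follower_replication {
        // One delegate per distinct remote group.
        remote.map(|p| p.group_id).collect::<HashSet<_>>().len()
    } else {
        remote.count()
    }
}
```

With all 7 peers in one group the count is zero either way, which matches the point of the comment: follower replication only pays off when peers are actually spread across groups.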
| // Also, A peer with smaller 'match' is able to receive more un-compacted entries from leader | ||
| // and then send them to others in same group. | ||
| // | ||
| fn pick_delegate(&self, group: &[u64], prs: &ProgressSet) -> u64 { |
It seems that in the new implementation, choosing the peer with the most Raft logs would be better?
That seems more reasonable. Actually, maybe we can choose any unpaused peer?
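The two suggestions above can be combined in a rough sketch: skip paused peers, then prefer the one with the largest `matched` index. The `Progress` struct here is a simplified, hypothetical stand-in for the real progress type, and returning `Option<u64>` instead of `INVALID_ID` is an assumption made for the example, not the PR's signature.

```rust
use std::cmp::Reverse;
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a peer's replication progress.
#[derive(Clone, Copy, Debug)]
pub struct Progress {
    pub matched: u64,
    pub paused: bool,
}

/// Picks a delegate from `group`: any unpaused peer qualifies, and among those
/// the peer with the most replicated log (largest `matched`) wins.
/// Ties are broken by the smaller peer id so the result is deterministic.
pub fn pick_delegate(group: &[u64], prs: &HashMap<u64, Progress>) -> Option<u64> {
    group
        .iter()
        .filter_map(|id| prs.get(id).map(|pr| (*id, pr)))
        .filter(|(_, pr)| !pr.paused)
        .max_by_key(|(id, pr)| (pr.matched, Reverse(*id)))
        .map(|(id, _)| id)
}
```

Returning `None` when every member is paused or unknown leaves the fallback decision (for example, sending directly from the leader) to the caller.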
| pub msgs: Vec<Message>, | ||
|
|
||
| /// The especial map for messages to be sent to group delegates. | ||
| pub delegated_msgs: HashMap<u64, Message>, |
I'd prefer to put it with groups.
| } | ||
|
|
||
| // Whether the given peer could use Follower Replication | ||
| fn use_delegate(&self, to: u64) -> bool { |
I prefer:
fn use_delegate(&self, to: u64) -> bool {
    self.is_leader() &&
        self.groups.get_members(to).map_or(0, |members| members.len()) > 1 &&
        !self.groups.in_same_group(self.id, to)
}
And we need to make sure that peers with an invalid group_id don't exist in groups.
| /// of produced message should be set to the leader id. | ||
| pub fn send_append(&mut self, to: u64, prs: &mut ProgressSet) { | ||
| if self.use_delegate(to) { | ||
| if let Some(gid) = self.groups.get_group_id(to) { |
Too many levels of indentation. Maybe we can define use_delegate(to) -> Option<group_id>.
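A minimal sketch of that idea, assuming a simplified `Groups` type that maps peer ids to group ids. `Groups`, `Node`, and `group_size` are hypothetical names invented for this example; only `use_delegate` and `get_group_id` echo names from the diff, and their real signatures may differ.

```rust
use std::collections::HashMap;

/// Hypothetical group registry: peer id -> group id.
pub struct Groups {
    group_of: HashMap<u64, u64>,
}

impl Groups {
    pub fn new(group_of: HashMap<u64, u64>) -> Self {
        Groups { group_of }
    }

    pub fn get_group_id(&self, id: u64) -> Option<u64> {
        self.group_of.get(&id).copied()
    }

    /// Number of peers registered in group `gid` (hypothetical helper).
    pub fn group_size(&self, gid: u64) -> usize {
        self.group_of.values().filter(|g| **g == gid).count()
    }
}

/// Hypothetical, stripped-down node holding only what the check needs.
pub struct Node {
    pub id: u64,
    pub is_leader: bool,
    pub groups: Groups,
}

impl Node {
    /// Returns the target's group id when follower replication applies to `to`,
    /// so the caller can use one `if let Some(gid)` instead of nested checks.
    pub fn use_delegate(&self, to: u64) -> Option<u64> {
        if !self.is_leader {
            return None;
        }
        let gid = self.groups.get_group_id(to)?;
        if self.groups.get_group_id(self.id) == Some(gid) {
            return None; // same group as the leader: send directly
        }
        if self.groups.group_size(gid) > 1 {
            Some(gid)
        } else {
            None // a single-member group gains nothing from a delegate
        }
    }
}
```

A caller such as `send_append` could then write `if let Some(gid) = self.use_delegate(to) { ... }`, which removes one level of nesting compared with a boolean check followed by a separate group lookup.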
| None => self.pick_delegate(&members, prs), | ||
| }; | ||
| if delegate != INVALID_ID { | ||
| if let Some(pr) = prs.get_mut(delegate) { |
I think calling unwrap is better, because the progress must exist.
| } | ||
| None => { | ||
| let members = self.groups.get_members(to).cloned().unwrap(); // this is safe because of the checking in `self.use_delegate` | ||
| let delegate = match self.groups.get_delegate(gid) { |
I prefer let delegate = self.pick_delegate(prs), which already contains the get_delegate logic. Also, it's valid only if the pr is not paused.
| *maybe_commit = true; | ||
| } | ||
|
|
||
| fn handle_append_response_in_delegate( |
It seems the only difference between this and handle_append_response is that the latter needs to handle leader transfer. That logic is compatible with delegates, so I think this function is unnecessary.
| self.set_prs(prs); | ||
| } else { | ||
| let members = self.groups.get_members(m.from_delegate).cloned(); | ||
| self.bcast_append(members.as_ref()); |
bcast_append calls send_append, which checks the members' delegate again. I believe this can be improved; maybe we can give it a try?
| self.read_states.push(rs); | ||
| } | ||
| MessageType::MsgAppendResponse => { | ||
| if self.prs().get(m.from).is_none() { |
Receiving responses from removed peers is a common case, so I think the log is unnecessary. Without it the code would be cleaner.
| // | ||
| // The delegate must satisfy conditions below: | ||
| // 1. Must be 'recent_active' | ||
| // 2. The progress state should be 'Replicate' but not 'paused' |
It seems Probe also needs to be allowed? I think calling is_paused here is fine.
Probe might not be suitable for a delegate from the leader's point of view, because a member in the Probe state may have just recovered from a crash or may be suffering a network partition.
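The two positions in this exchange can be compared side by side in a small sketch. Everything here is a hypothetical simplification: `paused` is modelled as a plain flag rather than derived from the progress state, and `eligible_strict` and `eligible_relaxed` are names invented for the example.

```rust
/// Simplified progress states, named after the states discussed above.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum ProgressState {
    Probe,
    Replicate,
    Snapshot,
}

/// Hypothetical, simplified view of a follower's progress as seen by the leader.
#[derive(Clone, Copy, Debug)]
pub struct Progress {
    pub state: ProgressState,
    pub paused: bool,
    pub recent_active: bool,
}

/// Strict rule from the code comment: recently active, in Replicate, not paused.
pub fn eligible_strict(pr: &Progress) -> bool {
    pr.recent_active && pr.state == ProgressState::Replicate && !pr.paused
}

/// Relaxed rule from the review suggestion: recently active and not paused,
/// which also admits an unpaused peer in the Probe state.
pub fn eligible_relaxed(pr: &Progress) -> bool {
    pr.recent_active && !pr.paused
}
```

The two predicates differ only for unpaused peers outside Replicate, which is exactly the case under discussion: the strict rule rejects a probing peer that may have just restarted, while the relaxed rule accepts it.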
| } | ||
|
|
||
| #[inline] | ||
| fn is_follower_replication_enabled(&self) -> bool { |
I don't think this is accurate. A peer can't know whether follower replication is enabled; it just chooses a strategy after receiving followers' messages.
We can add some comments to make this clear.
4259f29 to 8ed4829
b7d2176 to 5189128
88f583e to f89086c
Signed-off-by: Fullstop000 <fullstop1005@gmail.com>
88f583e to 1da637c
Signed-off-by: Fullstop000 <fullstop1005@gmail.com>

Any updates?

This PR has been inactive for a long time. Any updates here? Do we have a plan to get it merged?
#136
New features
Message Group and Delegate in the cluster topology
A message now could be sent to a proxy and the proxy redirects the message to the destination
TODOs
Delegate of a Group
Delegate can bcast_append or send_append to the other members in the Group
Problems
The ConfChange protocol for raft groups might need to be discussed. Solved as GroupConfig is now volatile.
The implementation has been ported to etcd: etcd-io/etcd#11455.